[EKS Prow Cluster] Add karpenter terraform module for eks-prow-build cluster #6895
Conversation
/assign @xmudrii
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull request has been approved by: koksay
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
@@ -81,6 +82,27 @@ locals {
      }
    ]

    karpenter_roles = [
should we use cluster access entry instead of adding to the aws-auth configmap?
That's a good question. I tried this code on the Canary cluster first and could not get Karpenter running with the access entry. I may have missed something, so I turned it off and continued with the aws-auth ConfigMap.
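For reference, the access-entry variant the diff turns off could look roughly like this with the terraform-aws-modules Karpenter submodule. This is a sketch only, assuming the v20 submodule inputs (`create_access_entry`, `access_entry_type`); values are illustrative:

```hcl
# Sketch: let the Karpenter submodule create an EKS access entry for the
# node role instead of managing it through the aws-auth ConfigMap.
# Assumes terraform-aws-modules/eks v20 inputs; values are illustrative.
module "karpenter" {
  source = "terraform-aws-modules/eks/aws//modules/karpenter"

  cluster_name        = module.eks.cluster_name
  create_access_entry = true
  access_entry_type   = "EC2_LINUX" # the default for node roles
}
```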
  cluster_name        = module.eks.cluster_name
  create_access_entry = false

  enable_irsa = true
why not use EKS Pod Identity here?
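As a sketch of that suggestion, a Pod Identity association could replace the IRSA annotation entirely. This assumes the controller role's trust policy allows the `pods.eks.amazonaws.com` service principal, and the `module.karpenter.iam_role_arn` output name is an assumption here:

```hcl
# Sketch: map the Karpenter controller role to its service account via
# EKS Pod Identity instead of IRSA. The role's trust policy must allow
# the pods.eks.amazonaws.com service principal. Names are illustrative.
resource "aws_eks_pod_identity_association" "karpenter" {
  cluster_name    = module.eks.cluster_name
  namespace       = "karpenter"
  service_account = "karpenter"
  role_arn        = module.karpenter.iam_role_arn
}
```

With this in place, the `eks.amazonaws.com/role-arn` annotation in the chart values would no longer be needed.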
rolearn  = "arn:aws:iam::468814281478:role/AWSReservedSSO_AdministratorAccess_abaef4db15a2c055"
username = "sso-admins"
groups = [
  "eks-cluster-admin"
not sure what permissions this group maps to, but similar to above - would cluster access entry work here instead?
interruptionQueue: Karpenter-prow-canary-cluster
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::054318140392:role/KarpenterController-20240527081538529900000002"
EKS Pod Identity would remove this hardcoded role ARN mapping in code
serviceAccount:
  annotations:
    # terraform state show module.karpenter.aws_iam_role.controller\[0\] | grep " arn "
    eks.amazonaws.com/role-arn: arn:aws:iam::468814281478:role/KarpenterController-20240527081538529900000002
same comment - EKS Pod Identity removes this hardcoding
@@ -90,6 +101,12 @@ data "aws_iam_policy_document" "eks_apply" {
      "logs:PutRetentionPolicy",
      "logs:TagLogGroup",
      "s3:PutObject",
      "sqs:createqueue",
Are these permissions required by Prow, or by Karpenter? Likewise for the EventBridge API permissions.
These are required by the Karpenter module; it creates an SQS queue for the event messaging.
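For context, that queue comes from the submodule's native spot-termination handling; a rough sketch (the `enable_spot_termination` variable name follows the terraform-aws-modules Karpenter submodule and is shown here as an assumption):

```hcl
# Sketch: enabling native spot-termination handling is what creates the
# SQS interruption queue and the EventBridge rules feeding it, hence the
# sqs:* and events:* permissions needed at apply time.
module "karpenter" {
  source = "terraform-aws-modules/eks/aws//modules/karpenter"

  cluster_name            = module.eks.cluster_name
  enable_spot_termination = true
}
```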
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
"events:DescribeRule",
"events:ListTagsForResource",
"events:ListTargetsByRule",
This should be part of the plan policy; we only keep the write permissions here.
"sqs:getqueueattributes",
"sqs:listqueuetags",
This should be part of the plan policy; we only keep the write permissions here.
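The split being asked for might look like this. An illustrative sketch only, with abbreviated action lists (write-side actions beyond the diff are assumptions); note that IAM `Action` matching is case-insensitive, so the lowercase names in the diff behave the same as CamelCase ones:

```hcl
# Illustrative split: read-only Karpenter-related actions go to the plan
# policy, mutating ones stay in the apply policy. Lists are abbreviated.
data "aws_iam_policy_document" "eks_plan" {
  statement {
    actions = [
      "events:DescribeRule",
      "events:ListTagsForResource",
      "events:ListTargetsByRule",
      "sqs:GetQueueAttributes",
      "sqs:ListQueueTags",
    ]
    resources = ["*"]
  }
}

data "aws_iam_policy_document" "eks_apply" {
  statement {
    actions = [
      "sqs:CreateQueue", # write actions only
    ]
    resources = ["*"]
  }
}
```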
Why are those permissions lowercase while all the other permissions are CamelCase? Is that how they appear on the AWS side, or doesn't AWS care at all?
# default affinity rule works well in our case
# https://github.com/aws/karpenter-provider-aws/blob/main/charts/karpenter/values.yaml#L70
tolerations:
Do we want to force scheduling Karpenter on the stable nodes?
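If we did want to pin it, a hedged sketch of the chart values, expressed here through a `helm_release` for illustration; the stable node group's `node-group: stable` label and `dedicated` taint are assumptions, not values from this repo:

```hcl
# Hypothetical sketch: force the Karpenter controller onto the stable
# node group with a nodeSelector, plus a toleration for its taint.
# The "node-group: stable" label and "dedicated" taint are assumptions.
resource "helm_release" "karpenter" {
  name       = "karpenter"
  namespace  = "karpenter"
  repository = "oci://public.ecr.aws/karpenter"
  chart      = "karpenter"

  values = [<<-YAML
    nodeSelector:
      node-group: stable
    tolerations:
      - key: dedicated
        operator: Exists
        effect: NoSchedule
  YAML
  ]
}
```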
# default affinity rule works well in our case
# https://github.com/aws/karpenter-provider-aws/blob/main/charts/karpenter/values.yaml#L70
tolerations:
Same here, we should check if we want to force Karpenter on the stable nodes.
@@ -65,6 +64,7 @@ module "vpc" {
  public_subnet_tags = {
    "kubernetes.io/role/elb"                    = 1
    "kubernetes.io/cluster/${var.cluster_name}" = "owned"
    "karpenter.sh/discovery"                    = var.cluster_name
Can we add a comment why is this required?
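For context on what such a comment could say: `karpenter.sh/discovery` is Karpenter's conventional discovery tag, which an `EC2NodeClass` matches to find the subnets (and security groups) to launch nodes into. A hedged sketch via `kubernetes_manifest` (resource name, `amiFamily`, and the `node_iam_role_name` output are illustrative assumptions):

```hcl
# Sketch: Karpenter discovers launch subnets and security groups by tag;
# the "karpenter.sh/discovery" tag added to the subnets above is what
# these selector terms match. Names and amiFamily are illustrative.
resource "kubernetes_manifest" "karpenter_node_class" {
  manifest = {
    apiVersion = "karpenter.k8s.aws/v1beta1"
    kind       = "EC2NodeClass"
    metadata   = { name = "default" }
    spec = {
      amiFamily = "AL2"
      role      = module.karpenter.node_iam_role_name
      subnetSelectorTerms = [
        { tags = { "karpenter.sh/discovery" = var.cluster_name } }
      ]
      securityGroupSelectorTerms = [
        { tags = { "karpenter.sh/discovery" = var.cluster_name } }
      ]
    }
  }
}
```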
Superseded by #7063

/close
@xmudrii: Closed this PR. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Make sure to update the `serviceAccount.annotations` field in the `infra/aws/terraform/prow-build-cluster/resources/karpenter/flux-hr-karpenter.yaml` file (also in `infra/aws/terraform/prow-build-cluster/resources/karpenter/prod-cluster-values`).

There will be a follow-up PR to add NodePool and NodeClass configuration.